ON AN ALGORITHM FOR MULTIPERIODIC WORDS 



ŠTĚPÁN HOLUB



Abstract. We consider an algorithm by Tijdeman and Zamboni constructing
a word of length n that has periods $p_1, \dots, p_r$, and the richest possible
alphabet. We show that this algorithm can be easily stated and its correctness
briefly proved using the class equivalence approach.



A SHORT (PERSONAL) HISTORY 

Non-trivial words with a given set of periods $P = \{p_1, p_2, \dots, p_r\}$ have received
a lot of attention in the past decade. The motivation was to generalize the result by 
Fine and Wilf dealing with two periods, which became part of the folklore. A word 
with periods P is called trivial if gcd(P) is its period too. Papers Q] and [I] are two 
(independent) results considering non-trivial such words with maximal length and 
maximal cardinality of its alphabet. Those papers completed some older research of 
Castelli, Mignosi, Restivo and Justin (see, e.g., [5] for more details and references). 
Already in 1998, I wrote a short manuscript giving an analogous result (without 
considering its publication), which I showed to Sorin Constantinescu during the 
conference WORDS 2003 in Turku, where he presented their results. Since this 
was passed without notice in the subsequent publication and since I considered my 
approach simpler and more natural, I later decided to publish it in [2]. There was
a gap in my paper, discovered by Gwenael Richomme, which is fixed in [3].

The present paper extends the same approach to the construction of the richest 
word with a given set of periods and a given length. The basic idea is to consider 
relations defined by the periods and understand letters as (names of) equivalence 
classes generated by those relations. The idea is obvious and well known, usually 
expressed using the graph terminology (edges and connected components), rather 
than the algebraic terminology (relations and equivalence classes). Tijdeman and 
Zamboni [5] point out that the straightforward algorithm based on the graph ap- 
proach is "simple but inefficient" and then present an algorithm based on less 
transparent combinatorial analysis. The aim of this paper is to give a short de- 
scription of their algorithm, as well as a short and intuitive proof of its correctness, 
using consistently the graph/equivalence viewpoint. 

1. Notation 

Let w be a word of length n over an alphabet A. The set of all letters that occur 
in w is denoted by alph(w). The i-th letter of w is denoted by w[i − 1] so that
$w = w[0]\, w[1] \cdots w[n-1]$. The prefix of w of length k is denoted by $\mathrm{pref}_k(w)$.

We say that a positive integer p is a period of a word w if w[i] = w[i + p] for
all $0 \le i \le |w| - p - 1$ (where |w| denotes the length of the word). Note that any
p > |w| is a period of w. If P is a set of positive integers such that each $p \in P$ is a
period of w, we say that w has periods P.
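The definition of a period can be checked directly. The following is a minimal Python sketch; the function names `has_period` and `has_periods` are illustrative and do not come from the paper, and words are represented as Python strings.

```python
def has_period(w: str, p: int) -> bool:
    """Return True if w[i] == w[i + p] for all 0 <= i <= len(w) - p - 1."""
    if p <= 0:
        raise ValueError("a period must be a positive integer")
    # For p >= len(w) the range is empty, so the condition holds vacuously.
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def has_periods(w: str, periods) -> bool:
    """Return True if every element of `periods` is a period of w."""
    return all(has_period(w, p) for p in periods)
```

For instance, `has_periods("01034010", {5, 7})` holds, while `has_period("01034010", 1)` does not.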

The word of length n having periods P and the maximal possible cardinality of
alph(w) is called an FW-word relative to P (where FW stands for "Fine and Wilf",
for historic reasons). The word is called trivial with respect to P if gcd(P) is a
period of w. The longest non-trivial FW-word relative to P is called an extremal
FW-word relative to P. We denote its length by C(P) (note that C(P) = L(P) − 1,
where L(P) is the notation adopted in [5]).

2. Classes of Equivalence 

Let w be a word which has periods P. For the rest of the paper we denote
$m = \min P$. Obviously, if $i \equiv j \bmod m$ or $|i - j| \in P$, then w[i] = w[j]. These two
conditions induce the relation $\sim_{P,k}$ on integers $\{0, \dots, k-1\}$ defined by:

$i \sim_{P,k} j$ if

• $i \equiv j \bmod m$, or

• there are integers $i', j' \in \{0, \dots, k-1\}$ such that

$i \equiv i' \bmod m$, $j \equiv j' \bmod m$

and

$|i' - j'| \in P$.

Let $\approx_{P,k}$ be the equivalence closure of $\sim_{P,k}$. In other words, we have $i \approx_{P,k} j$ if and
only if i and j lie in the same connected component of the graph defined by edges
$i \sim_{P,k} j$. The class of $\approx_{P,k}$ containing i will be denoted by $[i]_{P,k}$ and represented
by its minimal element $\min [i]_{P,k}$. Then we obtain a word FW(P, k) of length k over
the alphabet $\mathbb{N}$ by

$\mathrm{FW}(P,k)[i] = \min [i]_{P,k}$.

The construction immediately yields that FW(P, k) is the unique (up to renaming 
of letters) FW-word of length k relative to P. 
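The construction by equivalence classes can be sketched with a union-find structure. This is the straightforward graph-style computation, not the algorithm of Section 3. The function name `fw_by_classes` is hypothetical, and the sketch assumes that the closure is generated by the edges i, i + p for p in P, which include the steps by m = min P.

```python
def fw_by_classes(periods, k):
    """Return FW(P, k) as a list of integers, each class named by its minimum."""
    parent = list(range(k))

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            # Attach the larger root below the smaller one, so that
            # the root of every class is always its minimal element.
            if rx < ry:
                parent[ry] = rx
            else:
                parent[rx] = ry

    for p in periods:
        for i in range(k - p):
            union(i, i + p)  # positions i and i + p must carry the same letter
    return [find(i) for i in range(k)]
```

With P = {5, 7} and k = 8 this yields the letters 0, 1, 0, 3, 4, 0, 1, 0, that is, four distinct letters.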

3. The algorithm 

The basic step of the algorithm is the reduction of P to a new set of periods Q 
defined by 

(1) $Q = \{p - m \mid p \in P,\ p \ne m\} \cup \{m\}$

(where m = min P according to our convention). This reduction is, in fact, one step
in the Euclidean algorithm, and is well known in the literature on multiperiodic
words. The key fact about P and Q is expressed in the following lemma, which is
an improved version of Lemma 2 from [2].
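The reduction (1) is easy to state in code. The sketch below uses the hypothetical names `reduce_periods` and `reduction_chain`. The second function simply iterates the reduction until a single period remains, which illustrates the connection with the Euclidean algorithm: the last set is {gcd(P)}.

```python
def reduce_periods(periods):
    """One reduction step: Q = {p - m | p in P, p != m} united with {m}, m = min P."""
    m = min(periods)
    return {p - m for p in periods if p != m} | {m}


def reduction_chain(periods):
    """Iterate the reduction until only one period is left."""
    chain = [set(periods)]
    while len(chain[-1]) > 1:
        chain.append(reduce_periods(chain[-1]))
    return chain
```

For P = {5, 7} the chain is {5, 7}, {2, 5}, {2, 3}, {1, 2}, {1}.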

Lemma 1. Let k > 0. Then for all $i, j \in \{0, 1, \dots, k-1\}$

$[i]_{Q,k} = [j]_{Q,k}$ if and only if $[i]_{P,k+m} = [j]_{P,k+m}$.

Proof. "$\Rightarrow$": If $[i]_{Q,k} = [j]_{Q,k}$, then there is a sequence $i = i_0, \dots, i_\ell = j$ of numbers
from $\{0, 1, \dots, k-1\}$ such that

$i_s \sim_{Q,k} i_{s+1}$

for each $s = 0, \dots, \ell - 1$. The relation $i_s \sim_{Q,k} i_{s+1}$ implies $i_s \approx_{P,k+m} i_{s+1}$, since
either

• $i_s \equiv i_{s+1} \bmod m$, or

• $\max\{i_s, i_{s+1}\} + m - \min\{i_s, i_{s+1}\} \in P$.

Therefore $[i]_{P,k+m} = [j]_{P,k+m}$.

"$\Leftarrow$": On the other hand, let $i = i_0, \dots, i_\ell = j$ be a sequence of numbers from
$\{0, \dots, k+m-1\}$, with $i, j \in \{0, \dots, k-1\}$, such that

$i_s \sim_{P,k+m} i_{s+1}$

for each $s = 0, \dots, \ell - 1$. Certainly, we can suppose that the numbers in the
sequence are pairwise distinct, whence $|i_s - i_{s+1}| \ge m$ and both $\min\{i_s, i_{s+1}\}$ and
$\max\{i_s, i_{s+1}\} - m$ are in $\{0, 1, \dots, k-1\}$. We now see that

$\max\{i_s, i_{s+1}\} - m \approx_{Q,k} \min\{i_s, i_{s+1}\}$.

Therefore the sequence

$i = i_0, (i_0 \bmod m), \dots, (i_\ell \bmod m), i_\ell = j$

proves $[i]_{Q,k} = [j]_{Q,k}$. □

We have an immediate corollary. 

Corollary 1. For any k > 0, the word FW(Q, k) is a prefix of FW(P, k + m).

The following lemma is an easy observation. 

Lemma 2. Let $n - m \le i \le m - 1$. Then $[i]_{P,n} = \{i\}$.

Proof. Both i − p and i + p are out of range $\{0, 1, \dots, n-1\}$ for any $p \in P$ (including
m). Therefore i is not related by $\sim_{P,n}$ to any other element. □

From Corollary 1 and Lemma 2 the formula

$C(P) = m + \max\{C(Q),\ m - 1\}$

can be readily derived (see [2, 3]). In addition, it yields the following construction
of FW(P, n), equivalent to Algorithm B described in [5]. 

(1) If $n \le m$, then Lemma 2 gives

$\mathrm{FW}(P,n) = \mathtt{0} \cdot \mathtt{1} \cdots (\mathtt{n-1})$.

(Recall that we consider integers as letters. To stress that, we use the
typewriter font for them. The multiplication sign means concatenation.)

(2) Let n > m. Since the word FW(P, n) has a period m, it is determined by its
prefix w of length m. Denote u = FW(Q, n − m). Corollary 1 and Lemma 2
imply that

• $w = \mathrm{pref}_m(u)$ if m < n − m, and

• $w = u \cdot |u| \cdot (|u| + 1) \cdots (m - 1)$ otherwise.

This can be succinctly stated as:

$$\mathrm{FW}(P,n)[i] = \begin{cases} \mathrm{FW}(Q,\, n-m)[i \bmod m] & \text{if } (i \bmod m) < n - m, \\ i \bmod m & \text{otherwise.} \end{cases}$$
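The two cases above translate almost literally into a recursive procedure. The following Python sketch is one possible reading of the construction, not a reproduction of Algorithm B from [5]. The name `fw_word` is hypothetical, the set of periods is assumed to be non-empty, and the word is returned as a list of integer letters.

```python
def fw_word(periods, n):
    """Return FW(P, n) as a list of integer letters, following cases (1) and (2)."""
    periods = set(periods)
    m = min(periods)
    if n <= m:
        # Case (1): every position is alone in its class, so all letters differ.
        return list(range(n))
    # Case (2): reduce P to Q and recurse on the shorter length n - m.
    q = {p - m for p in periods if p != m} | {m}
    u = fw_word(q, n - m)
    # Position i copies u[i mod m] when that index exists in u;
    # otherwise it receives the fresh letter i mod m.
    return [u[i % m] if i % m < n - m else i % m for i in range(n)]
```

For example, `fw_word({5, 7}, 8)` gives 0, 1, 0, 3, 4, 0, 1, 0. Since the length drops by m in each call, the recursion depth is at most n; for very long words with small periods an iterative formulation may be preferable in Python.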



Example. Let P = {5, 7} and n = 8. Recursive definition of FW(P, 8) leads to

$P = Q_0 = \{5, 7\}$, $n = n_0 = 8$,
$Q_1 = \{2, 5\}$, $n_1 = 3$,
$Q_2 = \{2, 3\}$, $n_2 = 1$.

In order to obtain the word

$u_0 = \mathrm{FW}(Q_0, n_0) = \mathrm{FW}(P, 8)$

we will need words

$u_1 = \mathrm{FW}(Q_1, n_1)$ and $u_2 = \mathrm{FW}(Q_2, n_2)$.

Since $n_2 = 1$, we have $u_2 = \mathtt{0}$. From the point (2) above we have

$u_1 = \mathrm{pref}_3(w_1^{\omega})$ where $w_1 = \mathtt{01}$.

Therefore $u_1 = \mathtt{010}$. Similarly, we get

$u_0 = \mathrm{pref}_8(w_0^{\omega})$ where $w_0 = \mathtt{01034}$,

whence $\mathrm{FW}(P, 8) = \mathtt{01034010}$.
Schematically:

$Q_0 = \{5, 7\},\ n_0 = 8$  →  $Q_1 = \{2, 5\},\ n_1 = 3$  →  $Q_2 = \{2, 3\},\ n_2 = 1$

$u_0 = \mathtt{01034010},\ w_0 = \mathtt{01034}$  ←  $u_1 = \mathtt{010},\ w_1 = \mathtt{01}$  ←  $u_2 = \mathtt{0}$

□



From the above example we see that the procedure has two parts: "descending"
and "ascending", which are called "Reduction" and "Extension" in [5]. The end
of reduction can be defined in several ways. We have seen that we can turn to
extension as soon as we know $\mathrm{FW}(Q_i, n_i)$. This typically happens if $n_i \le \min Q_i$, or
if $\min Q_i = \gcd(Q_i)$.

4. Concluding remarks 

As already remarked, the above algorithm is identical with Algorithm B from [5].
All the arguments we use can, in some way, be traced back to similar arguments in the
literature. Nevertheless, I believe that the description presented here gives further
evidence that the equivalence class approach is not only simple but also
efficient and intuitive. (Another elegant example, in my opinion, is the proof of the 
fact that the extremal FW-word is a palindrome, given in [2].) 

One possible drawback may be the somewhat discouraging notation, such as $\sim_{P,k}$, and the
fact that notions like "equivalence closure" may sound "too algebraic" to some
ears. Computer theorists may therefore prefer to translate the exposition into graph
language and speak about edges instead of generating relations and about connected 
components instead of equivalence classes. The rest will be the same. 